Abstract: In Linear Programming (LP) decoding of a Low-Density
Parity-Check (LDPC) code one minimizes a linear functional, with
coefficients related to log-likelihood ratios, over a relaxation of
the polytope spanned by the codewords [1]. In order to quantify LP
decoding, and thus to describe the performance of the error-correction
scheme at moderate and large Signal-to-Noise Ratios (SNR), it is
important to study the relaxed polytope and to understand better its
vertices, the so-called pseudo-codewords, especially those which are
neighbors of the zero codeword. In this manuscript we propose a
technique to heuristically create a list of these neighbors and their
distances. Our pseudo-codeword-search algorithm starts by randomly
choosing the initial configuration of the noise. The configuration is
modified through a discrete number of steps. Each step consists of two
sub-steps. Firstly, one applies an LP decoder to the noise
configuration, deriving a pseudo-codeword. Secondly, one finds the
configuration of the noise equidistant from the pseudo-codeword and
the zero codeword. The resulting noise configuration is used as an
entry for the next step. The iterations converge rapidly to a
pseudo-codeword neighboring the zero codeword. Repeated many times,
this procedure is characterized by the distribution function
(frequency spectrum) of the pseudo-codeword effective distance. The
effective distance of the coding scheme is approximated by the
shortest-distance pseudo-codeword in the spectrum. The efficiency of
the procedure is demonstrated on examples of the Tanner [155, 64, 20]
code and the Margulis p = 7 and p = 11 codes (672 and 2640 bits long
respectively) operating over an Additive-White-Gaussian-Noise (AWGN)
channel.

Index Terms: LDPC codes, Linear Programming Decoding, Error-floor



I. Introduction I: LDPC codes, Belief Propagation 
and Linear Programming 

We consider an LDPC code (cf. Gallager [2]) defined by some sparse
parity-check matrix, $H = \{H_{\alpha i} \mid \alpha = 1, \dots, M;\ i = 1, \dots, N\}$,
of size $M \times N$. A codeword
$\sigma = \{\sigma_i = 0, 1;\ i = 1, \dots, N\}$ satisfies all the
check constraints: $\forall \alpha = 1, \dots, M$,
$\sum_i H_{\alpha i} \sigma_i = 0 \pmod 2$. We discuss the practical
case of finite $N$ and $M$, as opposed to the $N, M \to \infty$
(thermodynamic) limit for which Shannon capacity theorems were
formulated [3]. The codeword is sent over a noisy channel. To make our
consideration concrete, we consider the AWGN channel. (Notice that all
the discussions and results of this paper can be easily generalized to
other linear channel models.) Corruption of a codeword in the AWGN
channel is described by the following transition probability:



$$P(x|\sigma) \propto \prod_i \exp\left[-2s^2 (x_i - \sigma_i)^2\right], \tag{1}$$



where $x$ is the signal measured at the channel output and
$2s^2$ is the Signal-to-Noise Ratio (SNR) of the code, that is
traditionally denoted as $E_c/N_0$. Maximum Likelihood (ML) decoding,
corresponding to the restoration of the most probable pre-image
$\sigma'$ given the output signal $x$,



$$\arg\max_{\sigma'} P(x|\sigma'), \tag{2}$$



is not feasible in reality since its complexity grows exponen- 
tially with the system size. 
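
To make the exponential cost concrete, here is a minimal brute-force sketch of ML decoding for the AWGN model of Eq. (1). The 2 x 4 matrix `H`, the received vector `x` and all function names are illustrative assumptions and are not taken from the paper. The enumeration visits all 2^N bit patterns, which is why this approach is only usable for toy sizes.

```python
from itertools import product

# Hypothetical toy parity-check matrix (M = 2 checks, N = 4 bits).
H = [[1, 1, 1, 0],
     [0, 1, 1, 1]]

def is_codeword(H, sigma):
    """True if every check sums to 0 (mod 2)."""
    return all(sum(h * s for h, s in zip(row, sigma)) % 2 == 0 for row in H)

def codewords(H):
    """Enumerate all 2^N bit patterns and keep the codewords."""
    n = len(H[0])
    return [s for s in product((0, 1), repeat=n) if is_codeword(H, s)]

def ml_decode(H, x):
    """Maximizing Eq. (1) over codewords = minimizing sum_i (x_i - sigma_i)^2."""
    return min(codewords(H),
               key=lambda s: sum((xi - si) ** 2 for xi, si in zip(x, s)))

x = [0.1, 0.9, 0.8, 0.2]   # hypothetical channel output
decoded = ml_decode(H, x)
```

For this toy code the list returned by `codewords(H)` has four entries, while the loop that produced it inspected 2^4 = 16 patterns; for N = 155 the same loop would have to inspect 2^155 patterns.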

LP decoding was introduced by Feldman, Wainwright and
Karger [1] as a computationally efficient approximation to
ML decoding. Following [1], let us first notice that Eq. (2)
can be restated for the AWGN channel as calculating



$$\arg\min_{\sigma' \in P} \left( \sum_i (1 - 2x_i)\, \sigma'_i \right), \tag{3}$$



where $P$ is the polytope spanned by the codewords. Looking
for $\sigma'$ in terms of a linear combination of all possible
codewords of the code, $\sigma_v$: $\sigma' = \sum_v \lambda_v \sigma_v$,
where $\lambda_v \ge 0$ and $\sum_v \lambda_v = 1$, one finds that ML
turns into a linear optimization problem. LP decoding proposes to relax
the polytope, expressing $\sigma'$ in terms of a linear combination of
the so-called local codewords, i.e. codewords of trivial codes,
each associated with just one check of the original code and
all the variable nodes connected to it. We will come to the
formal definition of the LP decoding [1], [7], [8], [9] later, after
discussing the Belief Propagation (BP) decoding of Gallager
[2], [4], [5], [6].
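
Returning to the restatement of Eq. (2) as Eq. (3): the step rests on the identity $\sigma_i'^2 = \sigma'_i$ for binary $\sigma'_i$. The short derivation below is added for completeness and uses only Eq. (1).

$$-\frac{1}{2s^2}\ln P(x|\sigma') = \sum_i (x_i - \sigma'_i)^2 + \mathrm{const} = \sum_i x_i^2 + \sum_i (1 - 2x_i)\,\sigma'_i + \mathrm{const}, \qquad \sigma'_i \in \{0, 1\}.$$

The term $\sum_i x_i^2$ does not depend on $\sigma'$, so maximizing $P(x|\sigma')$ over codewords is the same as minimizing $\sum_i (1 - 2x_i)\sigma'_i$. Because this objective is linear, its minimum over the polytope $P$ is attained at a vertex of $P$, i.e. at a codeword.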

The belief-propagation (BP), or sum-product, algorithm of 
Gallager [2] (see also [4], [5], [6]) is a popular iterative 
scheme often used for decoding of the LDPC codes. For an 
idealized code containing no loops (i.e., there is a unique path 
connecting any two bits through a sequence of other bits and 
their neighboring checks), the sum-product algorithm (with 
sufficient number of iterations) is exactly equivalent to the 
so-called Maximum-A-Posteriori (MAP) decoding, which is 
reduced to ML in the asymptotic limit of infinite SNR. For any 
realistic code (with loops), the sum-product algorithm is ap- 
proximate, and it should actually be considered as an algorithm 
for solving iteratively certain nonlinear equations, called BP 
equations. The BP equations minimize the so-called Bethe free 
energy [10]. (The Bethe free energy approach originates from a 
variational methodology developed in statistical physics [11], 
[12].) Minimizing the Bethe free energy, that is a nonlinear
function of the probabilities/beliefs, under the set of linear 
(compatibility and normalizability) constraints, is generally a 
difficult task. 

BP decoding becomes LP decoding in the asymptotic limit 
of infinite SNR. Indeed in this special limit, the entropy terms 
in the Bethe free energy can be neglected and the problem 
becomes minimization of a linear function under a set of linear 
constraints. The similarity between LP and BP (the latter one 
being understood as minimizing the Bethe Free energy [10]) 
was first noticed in [1] and it was also discussed in [7], [8], 
[9]. Stated in terms of beliefs, i.e. trial marginal probabilities, 
LP decoding minimizes the Bethe self-energy: 

$$E = \sum_\alpha \sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha) \sum_{i \in \alpha} \sigma_i (1 - 2x_i)/k_i \tag{4}$$

with respect to beliefs $b_\alpha(\sigma_\alpha)$ and under certain
equality and inequality constraints. Here in Eq. (4) $k_i$ is the
degree of connectivity of the $i$-th bit; $\sigma_\alpha$ is a local
codeword, $\sigma_\alpha = \{\sigma_i \mid i \in \alpha,\ \sum_i H_{\alpha i}\sigma_i = 0 \pmod 2\}$,
associated with the check $\alpha$. The equality constraints are of
two types, normalization constraints (beliefs, as probabilities,
should sum to one) and compatibility constraints,

$$\forall \alpha: \quad \sum_{\sigma_\alpha} b_\alpha(\sigma_\alpha) = 1, \tag{5}$$

$$\forall i\ \forall \alpha \ni i: \quad b_i(\sigma_i) = \sum_{\sigma_\alpha \setminus \sigma_i} b_\alpha(\sigma_\alpha), \tag{6}$$

respectively, where $b_i(\sigma_i)$ is the belief (trial marginal
probability) to find bit $i$ in the state $\sigma_i$, and the check
belief, $b_\alpha(\sigma_\alpha)$, stands for the trial marginal
probability of finding bits, which are neighbors of the check
$\alpha$, in the state $\sigma_\alpha$. Also, all the beliefs, as
probabilities, should be non-negative and smaller than or equal to
unity. Thus there is the additional set of the inequality constraints:

$$0 \le b_i(\sigma_i),\ b_\alpha(\sigma_\alpha) \le 1. \tag{7}$$
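
The role of the $1/k_i$ factor in Eq. (4) can be checked numerically: once the compatibility constraints (6) hold, each bit contributes through all $k_i$ of its checks, so the self-energy collapses to the linear cost $\sum_i b_i(1)(1 - 2x_i)$ of Eq. (3). The sketch below evaluates both sides for a hand-built set of beliefs; the toy matrix `H`, the beliefs, the vector `x` and the function names are assumptions made for illustration only.

```python
# Hypothetical toy parity-check matrix (M = 2 checks, N = 4 bits).
H = [[1, 1, 1, 0],
     [0, 1, 1, 1]]

def bit_degrees(H):
    """k_i: number of checks that bit i participates in."""
    return [sum(row[i] for row in H) for i in range(len(H[0]))]

def self_energy(H, x, check_beliefs):
    """Eq. (4). check_beliefs[a] maps each local codeword of check a
    (a tuple over the bits of that check, in increasing bit order)
    to its belief b_a(sigma_a)."""
    k = bit_degrees(H)
    energy = 0.0
    for row, beliefs in zip(H, check_beliefs):
        bits = [i for i, h in enumerate(row) if h]
        for sigma_a, b in beliefs.items():
            energy += b * sum(s * (1 - 2 * x[i]) / k[i]
                              for s, i in zip(sigma_a, bits))
    return energy

def bit_marginals(H, check_beliefs):
    """b_i(1) computed from each neighboring check via Eq. (6);
    raises if two checks disagree (compatibility violated)."""
    marg = [None] * len(H[0])
    for row, beliefs in zip(H, check_beliefs):
        bits = [i for i, h in enumerate(row) if h]
        for pos, i in enumerate(bits):
            b1 = sum(b for sigma_a, b in beliefs.items() if sigma_a[pos] == 1)
            if marg[i] is not None and abs(marg[i] - b1) > 1e-12:
                raise ValueError("compatibility constraint (6) violated")
            marg[i] = b1
    return marg

# Equal mixture of the codewords 0000 and 0110, written as check beliefs.
check_beliefs = [{(0, 0, 0): 0.5, (0, 1, 1): 0.5},   # check 0: bits 0, 1, 2
                 {(0, 0, 0): 0.5, (1, 1, 0): 0.5}]   # check 1: bits 1, 2, 3
x = [0.1, 0.9, 0.8, 0.2]
E = self_energy(H, x, check_beliefs)
b1 = bit_marginals(H, check_beliefs)
E_linear = sum(b * (1 - 2 * xi) for b, xi in zip(b1, x))
```

Here `E` and `E_linear` agree up to rounding, which is the sense in which LP decoding minimizes a linear function of the bit beliefs.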

II. Introduction II: Pseudo codewords, Frame 
Error Rate and effective distance 

As was shown in [1], LP decoding has the ML certificate property,
i.e. if the pseudo-codeword obtained by the LP decoder has
only integer entries then it must be a codeword; in fact it
is the codeword returned by the ML decoder. If LP decoding
does not decode to a correct codeword then it usually yields
a non-codeword pseudo-codeword with some number of non-integers
among the beliefs $b_i$ and $b_\alpha$. These configurations
can be interpreted as mixed-state configurations consisting of
a probabilistic mixture of local codewords.

An important characteristic of the decoding performance
is the Frame Error Rate (FER), the probability of
decoding failure. FER decreases as SNR increases. The form
of this dependence gives an ultimate description of the coding
performance. Any decoding to a non-codeword pseudo-codeword
is a failure. Decoding to a codeword can also be a failure (when
the codeword differs from the transmitted one), and it then counts
as a failure under ML decoding as well. For large SNR, i.e. in the
so-called error-floor domain, the splitting of the two (FER vs SNR)
curves, representing ML decoding and an approximate decoding (say LP
decoding), is due to pseudo-codewords [13]. The actual asymptotics of
the two curves for the AWGN channel are
$\mathrm{FER}_{ML} \sim \exp(-d_{ML} \cdot s^2/2)$ and
$\mathrm{FER}_{LP} \sim \exp(-d_{LP} \cdot s^2/2)$, where $d_{ML}$ is
the so-called Hamming distance of the code and $d_{LP}$ is the
effective distance of the code, specific to LP decoding. The LP
error-floor asymptotic is normally shallower than the ML
one, $d_{LP} < d_{ML}$. The error floor can start at relatively low
values of FER, inaccessible to Monte-Carlo simulations. This
emphasizes the importance of pseudo-codeword analysis.

For a generic linear code used over a symmetric
channel, it is easy to show that the FER is invariant under
the change of the original codeword (sent into the channel).
Therefore, for the purpose of FER evaluation, it is sufficient to
analyze the statistics exclusively for the case of one codeword,
and the choice of the zero codeword is natural. Then, calculating
the effective distance of a code, one makes an assumption
that there exists a special configuration (or maybe a few
special configurations) of the noise, instantons according to
the terminology of [14], describing the large-SNR error-floor
asymptotic for FER. Suppose a pseudo-codeword,
$\tilde\sigma = \{\tilde\sigma_i = b_i(1);\ i = 1, \dots, N\}$,
corresponding to the most damaging configuration of the noise
(instanton), $x_{inst}$, is found. Then finding the instanton
configuration itself (i.e. the respective configuration of the noise)
is equivalent to maximizing the transition probability (1) with
respect to the noise field, $x$, taken at $\sigma = 0$, under the
condition that the self-energy calculated for the pseudo-codeword in
the given noise field $x$ is zero (i.e. it is equal to the value of
the self-energy for the zero codeword). The resulting expression for
the optimal configuration of the noise (instanton) is

$$x_{inst} = \frac{\tilde\sigma}{2}\, \frac{\sum_i \tilde\sigma_i}{\sum_i \tilde\sigma_i^2}, \tag{8}$$

and the respective effective distance is

$$d_{LP} = \frac{\left(\sum_i \tilde\sigma_i\right)^2}{\sum_i \tilde\sigma_i^2}. \tag{9}$$
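
For completeness, the instanton and effective-distance formulas follow from a short constrained minimization. The Lagrange multiplier $\lambda$ and the symbol $\tilde\sigma$ for the pseudo-codeword are notation used here only for this sketch. At $\sigma = 0$, maximizing Eq. (1) means minimizing $\sum_i x_i^2$, and with compatible beliefs the zero self-energy condition reads $\sum_i (1 - 2x_i)\tilde\sigma_i = 0$:

$$\frac{\partial}{\partial x_i}\Big[\sum_j x_j^2 + \lambda \sum_j (1 - 2x_j)\tilde\sigma_j\Big] = 0 \;\Rightarrow\; x_i = \lambda \tilde\sigma_i, \qquad \sum_i \tilde\sigma_i = 2\lambda \sum_i \tilde\sigma_i^2 \;\Rightarrow\; \lambda = \frac{\sum_i \tilde\sigma_i}{2\sum_i \tilde\sigma_i^2}.$$

Substituting back into Eq. (1) gives

$$P \propto \exp\Big[-2s^2 \lambda^2 \sum_i \tilde\sigma_i^2\Big] = \exp\Big[-\frac{s^2}{2}\, \frac{(\sum_i \tilde\sigma_i)^2}{\sum_i \tilde\sigma_i^2}\Big],$$

which has the form $\exp(-d \cdot s^2/2)$ of the FER asymptotics quoted above, with $d = (\sum_i \tilde\sigma_i)^2 / \sum_i \tilde\sigma_i^2$.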

This definition of the effective distance was first described
in [15], with the first applications of this formula to the LP
decoding discussed in [7] and [9]. Note also that Eqs. (8), (9)
are reminiscent of the formulas derived by Wiberg and co-authors
in [16] and [17], in the context of the computational
tree analysis applied to iterative decoding with a finite number
of iterations.
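
The instanton and effective-distance formulas of this Section translate directly into code. The following is a minimal sketch; the function names and the sample pseudo-codeword are illustrative assumptions. It also checks two properties implied by the construction: the self-energy of the pseudo-codeword vanishes on the instanton, and the squared length of the instanton equals one quarter of the effective distance.

```python
def effective_distance(sigma):
    """(sum_i sigma_i)^2 / sum_i sigma_i^2 for a nonzero pseudo-codeword."""
    s1 = sum(sigma)
    s2 = sum(v * v for v in sigma)
    return s1 * s1 / s2

def instanton(sigma):
    """x_inst = (sigma / 2) * sum_i sigma_i / sum_i sigma_i^2."""
    s1 = sum(sigma)
    s2 = sum(v * v for v in sigma)
    return [v * s1 / (2 * s2) for v in sigma]

sigma = [1.0, 0.5, 0.5, 0.0]   # hypothetical pseudo-codeword
x_inst = instanton(sigma)
d = effective_distance(sigma)

# Linear cost of sigma in the noise field x_inst (zero on the error-surface).
energy = sum((1 - 2 * xi) * si for xi, si in zip(x_inst, sigma))
# Squared Euclidean length of the instanton.
length_sq = sum(xi * xi for xi in x_inst)
```

For an integer codeword the same function returns its Hamming weight, e.g. `effective_distance([1, 1, 0, 1])` evaluates to 3.0. Both functions divide by $\sum_i \sigma_i^2$ and are therefore undefined for the zero codeword.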

III. Searching for pseudo-codewords 

In this Section, we turn directly to describing an algorithm
which allows one to efficiently find pseudo-codewords of an
LDPC code used over the AWGN channel and decoded
by LP. Once the algorithm is formulated, its relation to
the introductory material, as well as its partial justification and
motivation, will become clear.

Fig. 1. Schematic illustration of the pseudo-codeword-search algorithm. This
example terminates at $k_* = 3$. The point $1/2 = (1, \dots, 1)/2$ is shown
to illustrate that if one draws a straight line through $1/2$, such that it is
perpendicular to the straight line connecting $0 = (0, \dots, 0)$ and
$\sigma^{(k)}$, then the straight line must go $\epsilon$-approximately
through $x^{(k+1)}$. [We are thankful to Referee A for making this useful
1/2-related observation.]

The Pseudo-codeword search algorithm:

• Start: Initiate a starting configuration of the noise, $x^{(0)}$.
Noise measures a deviation from the zero codeword and
it should be sufficiently large to guarantee convergence
of LP to a pseudo-codeword different from the zero
codeword.

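• Step 1: The LP decoder finds the closest pseudo-codeword,
$\sigma^{(k)}$, for the given configuration of the noise $x^{(k)}$:

$$\left\{b_i^{(LP,k)}, b_\alpha^{(LP,k)}\right\} = \arg\min_{\{b_i(\sigma_i),\, b_\alpha(\sigma_\alpha)\}} \left\{ E\left(x^{(k)}; \{b_i(\sigma_i), b_\alpha(\sigma_\alpha)\}\right) \right\}_{\text{satisfying Eqs. (5), (6), (7)}},$$

$$\sigma_i^{(k)} = b_i^{(LP,k)}(1),$$

where the self-energy is defined according to Eq. (4).
In the case of degeneracy one picks any of the closest
pseudo-codewords.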

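• Step 2: Find $y^{(k)}$, the weighted median in the noise
space between the pseudo-codeword, $\sigma^{(k)}$, and the zero
codeword:

$$y^{(k)} = \frac{\sigma^{(k)}}{2}\, \frac{\sum_i \sigma_i^{(k)}}{\sum_i \left(\sigma_i^{(k)}\right)^2}.$$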

• Step 3: If $y^{(k)} = y^{(k-1)}$, then $k_* = k$ and the algorithm
terminates. Otherwise go to Step 1, assigning
$x^{(k+1)} = y^{(k)} + \epsilon$ for some very small $\epsilon$. (The
$+\epsilon$ shift prevents decoding into the zero codeword, keeping the
result of decoding within the erroneous domain.)

$y^{(k_*)}$ is the output configuration of the noise that belongs
to the error-surface surrounding the zero codeword. (The
error-surface separates the domain of correct LP decisions from the
domain of incorrect LP decisions.) Moreover, locally, i.e. for
the given part of the error-surface equidistant from the zero
codeword and the pseudo-codeword $\sigma^{(k_*)}$, $y^{(k_*)}$ is the
nearest point of the error-surface to the zero codeword.
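
The steps above can be sketched in a few lines. This is an illustration under loudly stated assumptions, not the authors' implementation: the LP decoder is replaced by a minimization over an explicitly given list of points (a linear objective over a polytope is minimized at a vertex, so this equals an LP over the convex hull of the list, which is feasible only for toy examples), the $+\epsilon$ step is taken to be a uniform shift of every component, and the point list and starting noise are made up.

```python
def lp_decode(x, vertices):
    """Stand-in for Step 1: minimize sum_i (1 - 2 x_i) sigma_i over the
    listed points (assumed to span the relaxed polytope of a toy code)."""
    return min(vertices,
               key=lambda v: sum((1 - 2 * xi) * vi for xi, vi in zip(x, v)))

def weighted_median(sigma):
    """Step 2: point equidistant from 0 and sigma that is nearest to 0."""
    s1 = sum(sigma)
    s2 = sum(v * v for v in sigma)
    return [v * s1 / (2 * s2) for v in sigma]

def pseudo_codeword_search(x0, vertices, eps=1e-6, tol=1e-12, max_steps=100):
    x, y_prev = list(x0), None
    for k in range(max_steps):
        sigma = lp_decode(x, vertices)                   # Step 1
        if not any(sigma):
            raise ValueError("decoded to the zero codeword; start farther away")
        y = weighted_median(sigma)                       # Step 2
        if y_prev is not None and all(abs(a - b) <= tol
                                      for a, b in zip(y, y_prev)):
            return sigma, y, k                           # Step 3: k_* = k
        x = [yi + eps for yi in y]                       # assumed uniform shift
        y_prev = y
    raise RuntimeError("no stopping point within max_steps")

# Hypothetical point list: zero codeword, two weight-3 points, one fractional point.
vertices = [(0, 0, 0, 0), (1, 1, 1, 0), (0, 1, 1, 1), (0.5, 0.5, 0.5, 0.5)]
sigma, y, k_star = pseudo_codeword_search([1.0, 0.3, 0.3, 1.0], vertices)
```

In this toy run the first decoding returns the fractional point (effective distance 4), the next iteration moves to (1, 1, 1, 0) (effective distance 3), and the following iteration reproduces the same median, so the search stops. A realistic decoder would instead solve the LP over beliefs with an LP solver.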

The algorithm is schematically illustrated in Fig. 1. We
repeat the algorithm many times, picking the initial noise
configuration randomly, however guaranteeing that it is
sufficiently far from the zero codeword so that the result
of the LP decoding (first step of the algorithm) is a
pseudo-codeword distinct from the zero codeword. Our simulations
(see discussions below) show that the algorithm converges,
and it does so in a relatively small number of iterations. The
convergence of the algorithm is translated into the statement
that the effective distance between $x^{(k)}$ and the zero codeword
does not increase, but typically decreases, with iterations.
Once the algorithm converges the resulting pseudo-codeword
belongs to the error-surface. This observation was tested by
shifting the instanton configuration of the noise corresponding
to the pseudo-codeword towards the zero codeword and observing
that the result of decoding is the zero codeword. The
effective distance of the coding scheme is approximated by

$$d_{LP} \approx \min \frac{\left(\sum_i \sigma_i^{(k_*)}\right)^2}{\sum_i \left(\sigma_i^{(k_*)}\right)^2}, \tag{10}$$

where the minimum is taken over multiple evaluations of the
algorithm. It is not guaranteed that the noise configuration with
the lowest possible (of all the pseudo-codewords within the
decoding scheme) distance is found after multiple evaluations
of the algorithm. Also, we do not have a formal proof of
the fact that, beginning with a random $x^{(0)}$, our algorithm
explores the entire phase space of all pseudo-codewords on
the error-surface. However our working conjecture is that the
rhs of Eq. (10) gives a very tight (if the number of attempts
is sufficient) upper bound on the actual effective distance of
the coding scheme.
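
The post-processing of many independent runs is simple bookkeeping, sketched below. The list `runs` of effective distances is made up for illustration, and the bin width and function names are assumptions; in practice each entry would come from one evaluation of the search started from a random noise configuration.

```python
from collections import Counter

def frequency_spectrum(distances, bin_width=0.5):
    """Histogram of effective distances: left bin edge -> fraction of runs."""
    counts = Counter(int(d // bin_width) for d in distances)
    total = len(distances)
    return {b * bin_width: c / total for b, c in sorted(counts.items())}

def estimated_d_lp(distances):
    """Minimum over all attempts: an upper bound on the effective distance."""
    return min(distances)

# Hypothetical effective distances from five independent runs of the search.
runs = [3.0, 4.0, 3.0, 8 / 3, 4.0]
spectrum = frequency_spectrum(runs)
best = estimated_d_lp(runs)
```

Since the search is heuristic, `best` can only overestimate the true effective distance; adding more runs can lower it but never raise it.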

IV. Examples 

In this Section, we demonstrate the power of the simple 
procedure explained in the previous Section by considering 
three popular examples of relatively long regular LDPC codes. 

A. The Tanner [155,64,20] code of [18] 

For this code $N = 155$ and $M = 93$. The Hamming
distance of the code is known to be $d_{ML} = 20$. The authors
of [7] reported a pseudo-codeword with $d = 16.406$. The
lowest effective distance configuration found as a result of
our search procedure is $d_{LP} \approx 16.4037$. These two, and
some number of other low-lying (in the sense of their
effective distance) configurations, are shown in Fig. 2. The
resulting frequency spectrum (derived from 3,000 evaluations
of the pseudo-codeword-search algorithm) is shown in Fig. 3.
Notice that the pseudo-weight spectrum gap, defined as the
difference between the pseudo-weight of the non-codeword
minimal pseudo-codeword with smallest pseudo-weight and
the minimum distance [19], is negative for the code,
$\approx -3.5963$. Thus the LP decoding performance is strictly worse
than the ML decoding performance for SNR $\to \infty$.

B. The Margulis code [20] with p = 7 

This code has $N = 2 \cdot M = 672$ bits. The set of
four noise configurations with the lowest effective distance
found by the pseudo-codeword-search algorithm for the code
is shown in Fig. 4. The lowest configuration decodes into
a codeword with the Hamming distance 16. A large gap
separates this configuration from the next lowest configuration
corresponding to a pseudo-codeword that is not a codeword.
Since the pseudo-weight spectrum gap is positive in this case,
the LP decoding approaches the ML decoding performance
for SNR $\to \infty$. The frequency spectrum, characterizing the
performance of the pseudo-codeword-search algorithm for this
code, is shown in Fig. 5.

TO APPEAR IN IEEE TRANSACTIONS ON INFORMATION THEORY, 2007

[Figure 2: eight panels labeled by effective distance 16.4145, 16.4119,
16.4113, 16.4095, 16.4060, 16.4058, 16.4044, 16.4037; horizontal axis:
bit label, i = 0, ..., 154.]

Fig. 2. The 8 lowest configurations found by the pseudo-codeword-search
algorithm for the [155, 64, 20] code. The typical number of evaluations
required to reach a stopping point is 5 to 15.

Fig. 3. The frequency spectrum (distribution function) of the effective
distance constructed from 3,000 attempts of our pseudo-codeword search
algorithm for the [155, 64, 20] code.

C. The Margulis code [20] with p = 11 

This code is $N = 2 \cdot M = 2640$ bits long. We have a
relatively small number of configurations (30) here because
it takes much longer to execute the LP decoding in this
case. Some 30 to 60 steps of the pseudo-codeword search
are required for a typical realization of the algorithm to reach
a stopping point. The four lowest configurations are shown
in Fig. 6. Obviously, with limited statistics one cannot claim
that the noise configuration with the lowest possible effective
distance has been found. All stopping point configurations found
here correspond to pseudo-codewords. (The Hamming distance
for this code is not known, while the pessimistic upper bound
mentioned in [21] is 220.)

Fig. 4. The 4 lowest noise configurations found by our pseudo-codeword
search algorithm for the Margulis p = 7 code of [20]. The typical number of
evaluations required to reach a stopping point is between 10 and 20.

Fig. 5. The frequency spectrum (distribution function) of the effective
distance found through multiple attempts of the pseudo-codeword-search
algorithm for the Margulis p = 7 code. The figure is built on 250 evaluations
of the pseudo-codeword-search algorithm.

V. Conclusions and Discussions 

Let us discuss the utility of the pseudo-codeword search 
algorithm proposed in this manuscript. The algorithm gives 
an efficient way of describing the LP decoding polytope and 
the pseudo-codeword spectra of the code. It approximates the 
pseudo-codeword and the respective noise configuration on the 
error-surface surrounding the zero codeword, corresponding to 
the shortest effective distance of the code. Our test shows that 
the algorithm converges very rapidly. (Even for the 2640 bits 
code, the longest code we considered, it typically takes only 
30 to 60 steps of the pseudo-codeword search algorithm to 
converge.) As already mentioned, this procedure applies to 
any linear channel. One only needs to make modifications in 
Eqs. (1), (8), (9), (10) and also in the basic equation of Step 2.

Fig. 6. The 4 lowest noise configurations found by our pseudo-codeword
search algorithm for the Margulis p = 11 code of [20]. The typical number 
of the pseudo-codeword-search iterations required to reach a stopping point 
is in between 30 and 60. 



One would obviously be interested in extending the pseudo- 
codeword search algorithm to other decodings, e.g. to find the 
effective minimal distance of the sum-product decoding. We 
observed, however, that a naive extension of this procedure 
does not work. The very special feature of the LP-case is that 
the noise configuration found as a weighted median of the 
zero codeword and a pseudo-codeword ($+\epsilon$, as in Step 3
of the pseudo-codeword search algorithm) is not decoded into 
the zero codeword. This allows us to proceed with the search 
algorithm always decreasing the effective distance or at least 
keeping it constant. It is not yet clear if this key feature of the 
LP decoding is extendable (hopefully with some modification 
of the weighted median procedure) to iterative decoding. This 
question requires further investigation. 

Even though the direct attempt to extend the LP-based 
pseudo-codewords-search algorithm to the sum-product de- 
coding failed, we still found an indirect way of using these 
LP results to analyze the sum-product decoding. The most 
damaging configuration of the noise found within the pseudo- 
codeword-search procedure becomes a very good entry point 
for the instanton-amoeba method of [14], designed for finding 
instanton configurations (most damaging configurations of the 
noise) for the case of the standard iterative decoding. This hy- 
brid method works well, sometimes resulting in the discovery 
of pseudo-codewords (of the respective iterative scheme) with 
impressively small effective distance. We attribute this fact to 
the close relation existing between the LP decoding and the 
BP decoding [1], [7], [8], [9]. Some preliminary results of this 
hybrid analysis are discussed in [22]. Summarizing, the LP- 
based pseudo-codeword search algorithm, complemented and 
extended by the instanton-amoeba method of [14], provides 
an efficient practical tool for the analysis of effective dis- 
tances, most damaging configurations of the noise (instantons) 
describing the error-floor, and their frequency spectra for an 
arbitrary LDPC code performing over a linear channel and 
decoded by LP decoding or iteratively. 

After the original version of the manuscript was submitted
for publication, we learned about some important new
results concerning reducing the complexity of LP decoding [23],
[24]. It is also appropriate to mention here the most recent
publications exploring possibilities of LP-decoding improvement
[25], [26]. These new techniques and ideas, combined with
the pseudo-codeword-search algorithm, open interesting new
opportunities for exploring and improving decoding schemes
of even longer LDPC codes.